29 research outputs found

    Finding Similarities between Structured Documents as a Crucial Stage for Generic Structured Document Classifier

    One of the problems addressed in classifying structured documents is the definition of a similarity measure that is applicable in real situations, where query documents are allowed to differ from the database templates. Furthermore, the test sets may contain rotated [1], noise-corrupted [2], or manually edited forms and documents produced under different schemes, which makes direct comparison a crucial issue [3]. Another problem is that a huge number of forms may be written in different languages; in Malaysia, for example, forms may be written in Malay, Chinese, English, and other languages. In such cases, text recognition (such as OCR) cannot be applied to classify the requested documents, even though OCR is generally considered easier and more accurate than layout detection. Keywords: Feature Extraction, Document Processing, Document Classification
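
    As a rough, hypothetical illustration of the kind of layout-based similarity measure discussed above (this is not the classifier proposed in the paper; the grid size and the layout_grid helper are assumptions made for the example), a query form can be compared against a database template by its ink-density grid alone, which sidesteps OCR entirely:

        # Hypothetical sketch: compare two structured documents by layout alone,
        # so classification still works when OCR is impractical (e.g. multilingual forms).
        from math import sqrt

        def layout_grid(page, rows=8, cols=8):
            """Split a binary page (2D list of 0/1 pixels) into rows x cols cells
            and return the fraction of ink pixels in each cell as a flat vector."""
            h, w = len(page), len(page[0])
            grid = []
            for r in range(rows):
                for c in range(cols):
                    cell = [page[y][x]
                            for y in range(r * h // rows, (r + 1) * h // rows)
                            for x in range(c * w // cols, (c + 1) * w // cols)]
                    grid.append(sum(cell) / max(len(cell), 1))
            return grid

        def cosine(a, b):
            """Cosine similarity between two feature vectors (1.0 = identical layout)."""
            dot = sum(x * y for x, y in zip(a, b))
            na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(y * y for y in b))
            return dot / (na * nb) if na and nb else 0.0

        def layout_similarity(query_page, template_page):
            return cosine(layout_grid(query_page), layout_grid(template_page))

    A query would then be assigned to the template with the highest similarity score, with a threshold to reject forms that match no template.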

    Evacuation routing optimizer (EROP) / Azlinah Mohamed … [et_al.]

    This report presents solutions to two of the most critical processes in planning for flash flood evacuation: the evacuation vehicle assignment problem (EVAP) and the evacuation vehicle routing problem (EVRP). With these solutions, the evacuation routing optimizer (EROP) is constructed. The EVAP is solved first, followed by the EVRP. For the EVAP, a discrete particle position is proposed to support the implementation of a discrete particle swarm optimization called myDPSOVAP-A. Particle positions are initially calculated based on the average passenger capacity of each evacuation vehicle. We experiment with different numbers of potential flooded areas (PFA) using two types of sequences for vehicle capacity: random order and ascending sorted order. Both of these sequences are tested with different inertia weights, constriction coefficients (CF), and acceleration coefficients. We analyse the performance of each vehicle allocation in four experiment categories: myDPSOVAP-A using inertia weight with random vehicle capacity, myDPSOVAP-A using inertia weight with ascending sorted vehicle capacity, myDPSOVAP-A using CF with random vehicle capacity, and myDPSOVAP-A using CF with ascending sorted vehicle capacity. Flash flood evacuation datasets from Malaysia are used in the experiments. myDPSOVAP-A using inertia weight was found to give the best results for both random and ascending sorted order of vehicle capacity. Solutions reached using CF with random capacity and using inertia weight with ascending sorted capacity were shown to be competitive with those obtained using inertia weight with random capacity. Overall, myDPSOVAP-A outperformed both a genetic algorithm with random vehicle capacity and a genetic algorithm with ascending sorted vehicle capacity in solving the EVAP. For the EVRP, myDPSOVRP1 is modified and renamed myDPSO_VRP_2; it adopts a new solution mapping which incorporates graph decomposition and random selection of priority values. The purpose of this mapping is to reduce the search space of the particles, leading to better solutions. Computational experiments use an EVRP dataset based on the road network for flash flood evacuation in Johor State, Malaysia. myDPSOVRP1 and myDPSO_VRP_2 are each compared with a genetic algorithm (GA) using solution mapping for the EVRP. The results indicate that the proposed myDPSO_VRP_2 is highly competitive and shows good performance in both fitness value and processing time. Overall, myDPSO_VRP_2 and myDPSOVAP-A, which are the main components of the EROP, gave good performance in maximizing the number of people assigned to vehicles and minimizing the total travelling time from vehicle locations to the PFAs. EROP is embedded with myDPSO_VRP_2 and retrieves the generated capacitated vehicles from myDPSOVAP-A. EROP also accommodates the routing of vehicles from the PFAs to relief centres to support the whole evacuation route planning process
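
    The details of myDPSOVAP-A and myDPSO_VRP_2 are not reproduced in this abstract; the sketch below is only a generic discrete-PSO illustration of the assignment idea (vehicles assigned to potential flooded areas, with fitness counting evacuees covered), and all capacities, demands, and coefficients are invented for the example:

        # Generic discrete-PSO sketch (not the myDPSOVAP-A algorithm itself):
        # each particle assigns every vehicle to one potential flooded area (PFA);
        # fitness counts evacuees covered. All numbers below are made up.
        import random

        NUM_VEHICLES, NUM_PFA = 6, 3
        CAPACITY = [40, 30, 30, 20, 20, 10]   # passengers per vehicle (assumed)
        DEMAND = [70, 50, 30]                 # evacuees waiting at each PFA (assumed)
        W, C1, C2, MUT = 0.7, 1.5, 1.5, 0.05  # inertia, acceleration, exploration rate

        def fitness(assign):
            """Evacuees actually moved: capacity sent to each PFA, capped by its demand."""
            sent = [0] * NUM_PFA
            for vehicle, pfa in enumerate(assign):
                sent[pfa] += CAPACITY[vehicle]
            return sum(min(s, d) for s, d in zip(sent, DEMAND))

        def update(position, pbest, gbest):
            """Per-dimension probabilistic move: keep (inertia), copy personal best,
            copy global best, or explore at random."""
            new_pos = []
            for i in range(NUM_VEHICLES):
                if random.random() < MUT:
                    new_pos.append(random.randrange(NUM_PFA))  # random exploration
                    continue
                r = random.uniform(0, W + C1 + C2)
                if r < W:
                    new_pos.append(position[i])   # inertia: keep current assignment
                elif r < W + C1:
                    new_pos.append(pbest[i])      # cognitive pull toward personal best
                else:
                    new_pos.append(gbest[i])      # social pull toward global best
            return new_pos

        swarm = [[random.randrange(NUM_PFA) for _ in range(NUM_VEHICLES)] for _ in range(10)]
        pbest = [p[:] for p in swarm]
        gbest = max(pbest, key=fitness)
        for _ in range(50):
            for k, pos in enumerate(swarm):
                swarm[k] = update(pos, pbest[k], gbest)
                if fitness(swarm[k]) > fitness(pbest[k]):
                    pbest[k] = swarm[k]
            gbest = max(pbest, key=fitness)
        print(gbest, fitness(gbest))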

    Feng Shui Garden adviser System (FengShuiGAS)

    This paper explores an approach to building an adaptive expert system prototype in an environment of human-computer collaboration. Components of an adaptive system are identified, with an emphasis on the mechanisms that enable adaptive behavior to occur. An adaptive expert system is necessary in order to communicate with the user and also to adapt to the user’s needs. The adaptive expert system in this particular project is implemented in a Feng Shui garden design domain. A frame-based data representation and a rule-based approach are applied in this project. In this research, the Feng Shui aspiration is adapted to the user’s assessment and choices based on their preferences. This experimental expert system prototype displays low-level adaptive capabilities that show sufficient promise to warrant further research
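
    As a toy illustration of the frame-based and rule-based style mentioned above (none of the slots, values, or rules below come from the FengShuiGAS knowledge base; they are invented for the example), a frame can be held as a dictionary of slots and each rule as a function that fires when its conditions match the user's frame:

        # Toy frame + rule sketch; the slots, values and rule are invented,
        # not taken from the FengShuiGAS knowledge base.
        user_frame = {"aspiration": "wealth", "garden_direction": "southeast", "space": "small"}

        def rule_wealth_water_feature(frame):
            """IF aspiration is wealth AND the garden faces southeast
            THEN recommend a small water feature."""
            if frame.get("aspiration") == "wealth" and frame.get("garden_direction") == "southeast":
                return "Place a small water feature in the southeast corner."
            return None

        rules = [rule_wealth_water_feature]
        advice = [r(user_frame) for r in rules if r(user_frame) is not None]
        print(advice)

    In this style, adaptivity amounts to adding, removing, or re-weighting rules as the user's assessments and preferences change.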

    Evolution of Information Systems in Malaysia

    Abstract. Malaysia has made a momentous decision by transforming itself from industrialization into the unknown territory of the knowledge economy. It is therefore important to establish a test bed to judge how successful this leap onto the new bandwagon has been. The Multimedia Super Corridor (MSC) is the major project in the country and has been established to test this. The question is: can Malaysia achieve its dream of implementing information systems successfully when the history of information systems is full of catastrophes? This paper discusses the findings from mapping Chris Sauer's triangle of dependencies model in order to foresee whether Malaysia has the potential to achieve successful implementation of information systems. Finally, through this investigation, we are able to outline external influences that can nurture the continuity of information systems dependencies in Malaysia and embed them as external factors in Sauer's model

    Survey on highly imbalanced multi-class data

    Machine learning technology has a massive impact on society because it offers solutions to many complicated problems such as classification, clustering analysis, and prediction, especially during the COVID-19 pandemic. Data distribution in machine learning has been an essential aspect of providing unbiased solutions. From the earliest literature published on highly imbalanced data until recently, machine learning research has focused mostly on binary classification problems. Research on highly imbalanced multi-class data remains largely unexplored, even as better analysis and prediction are needed for handling Big Data. This study reviews the models and techniques for handling highly imbalanced multi-class data, along with their strengths, weaknesses, and related domains. Furthermore, the paper uses a statistical method to explore a case study with a severely imbalanced dataset. This article aims to (1) understand the trend of highly imbalanced multi-class data through analysis of the related literature; (2) analyse previous and current methods of handling highly imbalanced multi-class data; and (3) construct a framework for highly imbalanced multi-class data. An analysis of the chosen highly imbalanced multi-class dataset is also performed and adapted to current machine learning methods and techniques, followed by discussions of open challenges and future directions for highly imbalanced multi-class data. Finally, this paper presents a novel framework for highly imbalanced multi-class data. We hope this research can provide insights into the potential development of better methods and techniques to handle and manipulate highly imbalanced multi-class data
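
    As a small, generic illustration of one common way to quantify and compensate for imbalance (this is not the framework proposed in the paper; the label counts below are invented), per-class imbalance ratios and inverse-frequency class weights can be computed as follows:

        # Hypothetical example: quantify imbalance and derive inverse-frequency
        # class weights for a multi-class label set (label counts are made up).
        from collections import Counter

        labels = ["A"] * 950 + ["B"] * 40 + ["C"] * 10   # a severely imbalanced toy set

        counts = Counter(labels)
        majority = max(counts.values())

        # imbalance ratio of each class relative to the majority class
        ratios = {cls: majority / n for cls, n in counts.items()}

        # inverse-frequency weights, normalised so a perfectly balanced set gives 1.0;
        # these can be fed to classifiers that accept per-class weights
        n_classes, n_samples = len(counts), len(labels)
        weights = {cls: n_samples / (n_classes * n) for cls, n in counts.items()}

        print(ratios)   # {'A': 1.0, 'B': 23.75, 'C': 95.0}
        print(weights)  # {'A': 0.35..., 'B': 8.33..., 'C': 33.33...}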

    Conceptual framework on information security risk management in information technology outsourcing / Nik Zulkarnaen Khidzir, Noor Habibah Arshad and Azlinah Mohamed

    Data security and protection are serious information security risks for information assets in IT outsourcing (ITO). Risk management and analysis for security management is therefore an approach to determine which security controls are appropriate and cost-effective to implement across an organization to secure data and information assets in ITO. However, previously established approaches do not focus extensively on information security risk in ITO. For that reason, a conceptual framework for information security risk management in IT outsourcing (ISRM-ITO) is introduced in this paper. An extensive literature review of the fundamental concepts, theoretical background, and previous findings on information security risk management and ITO was conducted. Through the review, the theoretical foundations and the processes that lead to success in managing information security risk in ITO were identified, and these findings became key components in developing the conceptual framework. The ISRM-ITO conceptual framework consists of two layers. The first layer concentrates on the identification and analysis of information security risks before the decision to outsource is made. The second layer covers the information security risk management approach used to analyse, mitigate, and monitor risks for the rest of the ITO lifecycle. The proposed conceptual framework could improve organizational practice in information security for IT outsourcing through the adoption of a risk management approach. Finally, an approach to determining cost-effective security controls for information security risks can be implemented successfully across the ITO lifecycle

    Rawatan psikoterapi melalui kaedah tazkiyah al-nafs oleh Syeikh Abdul Qadir Al-Mandili dalam kitab penawar bagi hati

    The concept of purification of the heart (tazkiyah al-nafs), also referred to as tazkiyah al-qalb, is a core theme in the discussion of the Kitab Penawar Bagi Hati compiled by Syeikh Abdul Qadir al-Mandili. He presents the concepts of purification of the heart in three main components: first, the control and prevention of corruption of the seven outward limbs; second, the treatment and rehabilitation of blameworthy traits (mazmumah); and finally, the instilling of praiseworthy traits (mahmudah). This paper introduces the tazkiyah al-nafs method, comprehensively developed by Syeikh Abdul Qadir al-Mandili, as a remedy for ailments that originate from the heart

    WORD SENSE DISAMBIGUATION USING FUZZY SEMANTIC-BASED STRING SIMILARITY MODEL

    Sentences are the language of human communication. This communication medium is so fluid that words and their meanings can be interpreted in many ways by readers. Moreover, a document consisting of thousands of sentences is difficult for a reader to fully understand. In such cases, computational power is required to analyse very large volumes of text. However, the accuracy of the meaning a computer derives from a passage is still actively debated. One reason for this issue is the existence of ambiguous words with multiple meanings in a sentence. A passage might be incorrectly translated because of a wrong sense selection during the early phase of sentence translation; translating a sentence in this paper means determining whether the sentence has a negative or a positive meaning. Thus, this research discusses how to disambiguate a term in a sentence by referring to the WordNet repository, proposing the use of a fuzzy semantic-based similarity model. The proposed model promises to return good results for detecting the similarity of two sentences, as has been shown in past research. At the end of this paper, a preliminary result showing how the proposed framework works is discussed
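
    The fuzzy semantic-based similarity model itself is not detailed in this abstract; as a minimal, simplified stand-in, the sketch below picks the WordNet sense of an ambiguous word that is most similar (by path similarity) to its context words, using NLTK's WordNet interface and assuming the wordnet corpus has been downloaded:

        # Minimal similarity-based sense selection over WordNet; a simplification,
        # not the paper's fuzzy semantic-based model.
        # Requires: pip install nltk
        #           python -c "import nltk; nltk.download('wordnet')"
        from nltk.corpus import wordnet as wn

        def best_sense(ambiguous_word, context_words):
            """Return the synset of `ambiguous_word` whose summed path similarity
            to the context words' synsets is highest."""
            best, best_score = None, -1.0
            for sense in wn.synsets(ambiguous_word):
                score = 0.0
                for ctx in context_words:
                    sims = [sense.path_similarity(c) or 0.0 for c in wn.synsets(ctx)]
                    score += max(sims, default=0.0)
                if score > best_score:
                    best, best_score = sense, score
            return best

        # "bank" resolves differently in a financial context and a river context
        print(best_sense("bank", ["money", "loan"]))
        print(best_sense("bank", ["river", "water"]))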

    Comparative study of apriori-variant algorithms

    The Big Data era is generating tremendous amounts of data in various fields such as finance, social media, transportation, and medicine. Handling and processing this “big data” demands powerful data mining methods and analysis tools that can turn data into useful knowledge. One such data mining method is frequent itemset mining, which has been implemented in real-world applications such as identifying buying patterns in groceries and online customers’ behavior. Apriori is a classical algorithm in frequent itemset mining that is able to discover a large number of itemsets given a certain threshold value. However, the algorithm suffers from a scanning-time problem while generating candidate frequent itemsets. This study presents a comparison of several Apriori-variant algorithms and examines their scanning time. We performed experiments using several sets of different transactional data. The results show that the improved Apriori algorithm manages to produce itemsets faster than the original Apriori algorithm
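
    To make the scanning-time issue concrete, the sketch below is a compact, generic Apriori (not any of the improved variants compared in the study); every level of candidates triggers further passes over the transactions, which is exactly the cost the improved variants try to reduce. The toy baskets and threshold are assumptions:

        # Compact, generic Apriori sketch: every candidate level forces additional
        # scans of the transaction list, which is the scanning-time cost discussed above.
        from itertools import combinations

        def apriori(transactions, min_support):
            transactions = [frozenset(t) for t in transactions]
            items = {i for t in transactions for i in t}

            def support(itemset):
                # one full pass over the transactions per counted itemset
                return sum(1 for t in transactions if itemset <= t) / len(transactions)

            frequent = {}
            level = [frozenset([i]) for i in sorted(items)]
            while level:
                # scan phase: keep candidates of this size meeting the threshold
                survivors = {}
                for cand in level:
                    s = support(cand)
                    if s >= min_support:
                        survivors[cand] = s
                frequent.update(survivors)
                # join phase: build next-level candidates from surviving itemsets
                keys = list(survivors)
                candidates = {a | b for a, b in combinations(keys, 2)
                              if len(a | b) == len(a) + 1}
                # prune phase: drop candidates with any infrequent subset (Apriori property)
                level = [c for c in candidates
                         if all(frozenset(sub) in survivors
                                for sub in combinations(c, len(c) - 1))]
            return frequent

        baskets = [{"milk", "bread"}, {"milk", "diaper", "beer"},
                   {"bread", "diaper", "beer"}, {"milk", "bread", "diaper"}]
        print(apriori(baskets, min_support=0.5))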